Sparse direct solvers with accelerators over DAG runtimes
Authors
Abstract
The current trend in high-performance computing shows a dramatic increase in the number of cores on shared-memory compute nodes. Algorithms, especially those related to linear algebra, need to be adapted to these new computer architectures in order to be efficient. PASTIX is a sparse parallel direct solver that incorporates a dynamic scheduler for strongly hierarchical modern architectures. In this paper, we study the replacement of this internal, highly integrated scheduling strategy by two generic runtime frameworks: DAGUE and STARPU. These runtimes make it possible to execute the factorization task graph on emerging computers equipped with accelerators. As in previous work on dense linear algebra, we present the kernels used for GPU computations, inspired by the MAGMA library, and the DAG algorithm used with the two runtimes. A comparative study of the performance of the supernodal solver with the three different schedulers is performed on manycore architectures, and the improvements obtained with accelerators are presented with the STARPU runtime. These results demonstrate that these DAG runtimes provide uniform programming interfaces to obtain high performance on different architectures for irregular problems such as sparse direct factorizations.
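As a rough illustration of what a "factorization task graph" can look like, the following minimal sketch builds a dependency graph for a panel-based factorization and derives one valid execution order from it. It assumes a simple 1D decomposition with one factor task per panel and one update task per coupled panel pair. The names `build_dag`, `schedule`, and `couplings` are hypothetical and are not taken from PASTIX, DAGUE, or STARPU, and the list scheduler stands in for the far more elaborate policies of a real runtime.

```python
from collections import defaultdict, deque


def build_dag(couplings):
    """Build task dependencies for a panel factorization (illustrative only).

    couplings[k] lists the panels j > k that panel k contributes to.
    Returns a dict mapping each task to the set of tasks it must wait for.
    """
    deps = defaultdict(set)
    for k, targets in couplings.items():
        deps[("factor", k)]  # make sure the task exists even without inputs
        for j in targets:
            # The update of panel j by panel k needs panel k factorized first.
            deps[("update", k, j)].add(("factor", k))
            # Panel j can be factorized only after all its updates are applied.
            deps[("factor", j)].add(("update", k, j))
    return deps


def schedule(deps):
    """Return one dependency-respecting order using Kahn's algorithm."""
    remaining = {task: len(waits) for task, waits in deps.items()}
    successors = defaultdict(list)
    for task, waits in deps.items():
        for parent in waits:
            successors[parent].append(task)

    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in successors[task]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order


if __name__ == "__main__":
    # Hypothetical structure: panels 0 and 1 both contribute to panel 2.
    for task in schedule(build_dag({0: [2], 1: [2], 2: []})):
        print(task)
```

In this sketch, tasks that sit in the ready queue at the same time have no mutual dependencies, so a runtime would be free to dispatch them concurrently to different cores or accelerators.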
Similar resources
Multi-Elimination ILU Preconditioners on GPUs
Iterative solvers for sparse linear systems often benefit from using preconditioners. While there are implementations for many iterative methods that leverage the computing power of accelerators, porting the latest developments in preconditioners to accelerators has been challenging. In this paper we develop a self-adaptive multi-elimination preconditioner for graphics processing units (GPUs). T...
Domain Decomposition Based High Performance Parallel Computing
The study deals with the parallelization of finite element based Navier-Stokes codes using domain decomposition and state-of-the-art sparse direct solvers. There has been significant improvement in the performance of sparse direct solvers. Parallel sparse direct solvers are not found to exhibit good scalability. Hence, the parallelization of sparse direct solvers is done using domain decomposition t...
A Parallel Sweeping Preconditioner for Heterogeneous 3D Helmholtz Equations
A parallelization of a sweeping preconditioner for 3D Helmholtz equations without large cavities is introduced and benchmarked for several challenging velocity models. The setup and application costs of the sequential preconditioner are shown to be O(γN) and O(γN logN), where γ(ω) denotes the modestly frequency-dependent number of grid points per Perfectly Matched Layer. Several computational a...
A Sparse QS-Decomposition for Large Sparse Linear System of Equations
A direct solver for large-scale sparse linear systems of equations is presented in this paper. As a direct solver, this method is among the most efficient direct solvers available so far, with a flop count of O(n log n) in one-dimensional situations and O(n) in two-dimensional situations. This method has advantages over the existing fast solvers in which it can be used to handle more general situa...
Sparse Direct Linear Solvers: An Introduction
The minisymposium on sparse direct solvers included 11 talks on the state of the art in this area. The talks covered a wide spectrum of research activities in this area. The papers in this part of the proceedings are expanded, revised, and corrected versions of some of the papers that appeared in the CD-ROM proceedings that were distributed at the conference. Not all the talks in the minisymposium...
Publication date: 2017